
EmTaggeR: A Word Embedding Based Novel Method for Hashtag Recommendation on Twitter


Abstract

The hashtag recommendation problem concerns suggesting one or more hashtags to explicitly tag a post made on a given social network platform, based on the content and context of the post. In this work, we propose a novel methodology for hashtag recommendation for microblog posts, specifically on Twitter. The methodology, EmTaggeR, is a training-testing framework built on the concept of word embeddings. The training phase consists of learning the word vectors associated with each hashtag and deriving a word embedding for each hashtag. We provide two training procedures: one in which each hashtag is trained with a separate word embedding model applicable in the context of that hashtag, and another in which each hashtag obtains its embedding from a global context. The testing phase consists of computing the average word embedding of the test post and finding the similarity of this embedding to the known embeddings of the hashtags. The tweets that contain the most similar hashtag are extracted, and all the hashtags that appear in these tweets are ranked by embedding similarity score. The top-K hashtags in this ranked list are recommended for the given test post. Our system achieves an F1 score of 50.83%, improving over the LDA baseline by around 6.53 times and outperforming the best-performing system known in the literature, which provides a lift of 6.42 times. EmTaggeR is a fast, scalable, and lightweight system, which makes it practical to deploy in real-life applications.
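
The testing phase described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity as the similarity measure (the abstract does not name one), assumes that word and hashtag embeddings have already been produced by a training step, and uses hypothetical names (`average_embedding`, `recommend`) and made-up two-dimensional toy vectors.

```python
import math


def average_embedding(tokens, word_vectors):
    """Mean of the vectors of those tokens that have a known embedding."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; an assumed choice of similarity measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(post_tokens, word_vectors, hashtag_vectors, training_tweets, k=3):
    """Return up to k hashtags for a test post, following the abstract's steps."""
    post_vec = average_embedding(post_tokens, word_vectors)
    if post_vec is None:
        return []
    # Similarity of the post embedding to every known hashtag embedding.
    scores = {tag: cosine(post_vec, vec) for tag, vec in hashtag_vectors.items()}
    best = max(scores, key=scores.get)
    # Collect all hashtags from training tweets that contain the best hashtag.
    candidates = set()
    for tags in training_tweets:
        if best in tags:
            candidates.update(tag for tag in tags if tag in scores)
    # Rank the candidates by their similarity score and keep the top k.
    ranked = sorted(candidates, key=lambda tag: scores[tag], reverse=True)
    return ranked[:k]


# Toy data, invented purely for illustration.
word_vectors = {"goal": [1.0, 0.0], "match": [0.9, 0.1], "vote": [0.0, 1.0]}
hashtag_vectors = {
    "#football": [1.0, 0.1],
    "#worldcup": [0.8, 0.3],
    "#election": [0.0, 1.0],
}
# Each training tweet is represented only by the set of hashtags it carries.
training_tweets = [{"#football", "#worldcup"}, {"#election"}, {"#football"}]

print(recommend(["goal", "match"], word_vectors, hashtag_vectors, training_tweets, k=2))
```

In this toy setup the post about a goal and a match lies closest to `#football`, so the candidates are the hashtags that co-occur with `#football` in the training tweets. How the hashtag embeddings themselves are derived (per-hashtag models versus a global context) is left outside the sketch, since the abstract does not give those details.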
